feat(responses)!: add Prompts API to Responses API #3514
Conversation
Force-pushed from ef753bc to fe6ea4c
This is an API change unrelated to how prompts are used in /v1/responses.
Please review your code assistant output before posting it as a PR.
Hi @mattf! Could you please elaborate on how prompts should be used in the Responses API, in your opinion? My understanding was that they should be propagated to the Agent's messages context as OpenAISystemMessageParam.
Hey @r3v5, it looks like you've suggested adding `prompt_id` here, where you need to add a `Prompt` object with an `id`, `version`, and `variables`, which would then be consistent with OpenAI's client usage, as outlined here:
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    prompt={
        "id": "pmpt_68b0c29740048196bd3a6e6ac3c4d0e20ed9a13f0d15bf5e",
        "version": "2",
        "variables": {
            "city": "San Francisco",
            "age": 30,
        }
    }
)
So this is currently incorrect. As @mattf suggested, let's make sure we double check this. Thank you.
Oh yeah, this makes sense. I got it. I will adjust the implementation then.
Force-pushed from fe6ea4c to a3cdf78
""" | ||
|
||
id: str | ||
version: str | None = None |
version has type string because OpenAI defines it as a string. The reference is here.
@cdoern this is an enhancement to the /openai/v1/responses API; does it match the OpenAI /v1/responses API spec?
There seem to be some breaking changes, BUT these might have existed in main, let me check.
Force-pushed from a3cdf78 to d76b15b
Hey @cdoern, any update on the main branch check here?
Force-pushed from fadf1d0 to f474e0c
Force-pushed from 4175600 to f37efb1
Yeah, I have looked at it already. I guess we need to settle on your implementation to enable internal deps for agents. I already rebased from main. Right now I'm enhancing other parts of my code.
@franciscojavierarceo do you think we need to support variables like {{ name}} and { { name }} apart from the default behaviour like {{name}}?
I'm not sure; I'd recommend testing those edge cases with OpenAI and seeing their behavior. We should match theirs.
My understanding is that they only use placeholders like {{variable}}.
Force-pushed from 7f7a3df to 0dcb4df
# Replace the variables in the prompt text
# Support all whitespace variations: {{name}}, {{ name }}, {{ name}}, {{name }}, etc.
for name, value in prompt_params.variables.items():
    pattern = r"\{\{\s*" + re.escape(name) + r"\s*\}\}"
Implemented support for different placeholder styles for variables in the prompt. As I said, the OpenAI docs mention only one type of placeholder (e.g. {{variable}}), but I had a chance to create a free prompt in the OpenAI playground, and it seems they aren't so strict about placeholder whitespace. Therefore, at least for now, I allowed the different placeholder variations and covered them in unit tests.
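For illustration, here is a minimal standalone sketch of this substitution logic (the function name and the plain dict-of-strings signature are hypothetical; the real code iterates over prompt_params.variables as in the diff above):

import re

def substitute_placeholders(prompt_text: str, variables: dict[str, str]) -> str:
    # Replace {{name}}, {{ name }}, {{name }}, and {{ name}} with the variable's value.
    for name, value in variables.items():
        pattern = r"\{\{\s*" + re.escape(name) + r"\s*\}\}"
        prompt_text = re.sub(pattern, str(value), prompt_text)
    return prompt_text

# substitute_placeholders("You are a helpful {{ area_name }} assistant.", {"area_name": "geography"})
# -> "You are a helpful geography assistant."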
Awesome, thank you!
Force-pushed from 0dcb4df to 82be394
I wonder if we need to handle variables with Response input message types like input image and input file content?
Force-pushed from a3aba30 to 88b8d3b
I implemented support for those two as well. Can you guys PTAL?
Great work! Can you share a workflow example that uses the Prompts API and re-uses that prompt in Responses create? Thanks!
Path.home() / ".llama" / "files" / path,
Path.cwd() / "files" / path,
Path("/tmp/llama_stack_files") / path,
Why do we have to do this? Don't we have a source of truth for that location?
I think I can add the Files API as a dependency to OpenAIResponsesImpl, which would make it easy to get the actual path of the file without this workaround. What do you think, @leseb?
I will try this approach
Thought I had replied to this, sorry. Yes, please go ahead with this approach, because the current one feels hacky.
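A minimal hypothetical sketch of the proposed dependency injection (everything except the OpenAIResponsesImpl name is an assumption, not the actual Llama Stack interface):

class OpenAIResponsesImpl:
    def __init__(self, files_api):
        # The Files API implementation is injected at construction time.
        self.files_api = files_api

    async def _read_prompt_file(self, file_id: str) -> bytes:
        # Resolve file content through the Files API instead of probing
        # ~/.llama/files, ./files, and /tmp/llama_stack_files by hand.
        return await self.files_api.get_content(file_id)  # hypothetical method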
@json_schema_type
class OpenAIResponsePromptParam(BaseModel):
Why don't we use the type from llama_stack/apis/prompts/prompts.py?
The `Prompt` BaseModel in the Prompts API has a string `prompt` param and a boolean `is_default` param, while `OpenAIResponsePromptParam` contains exactly three params (`id`, `variables`, and `version`), matching the OpenAI schema. IMO, having two separate JSON schemas is good in terms of separation of concerns, and it lets us achieve full OpenAI compatibility (the reference can be found here):
from pydantic import BaseModel, Field

# json_schema_type and OpenAIResponseInputMessageContent come from Llama Stack's
# own modules (schema utilities and Responses content types, respectively).


@json_schema_type
class Prompt(BaseModel):
    """A prompt resource representing a stored OpenAI Compatible prompt template in Llama Stack.

    :param prompt: The system prompt text with variable placeholders. Variables are only supported when using the Responses API.
    :param version: Version (integer starting at 1, incremented on save)
    :param prompt_id: Unique identifier formatted as 'pmpt_<48-digit-hash>'
    :param variables: List of prompt variable names that can be used in the prompt template
    :param is_default: Boolean indicating whether this version is the default version for this prompt
    """

    prompt: str | None = Field(default=None, description="The system prompt with variable placeholders")
    version: int = Field(description="Version (integer starting at 1, incremented on save)", ge=1)
    prompt_id: str = Field(description="Unique identifier in format 'pmpt_<48-digit-hash>'")
    variables: list[str] = Field(
        default_factory=list, description="List of variable names that can be used in the prompt template"
    )
    is_default: bool = Field(
        default=False, description="Boolean indicating whether this version is the default version"
    )


@json_schema_type
class OpenAIResponsePromptParam(BaseModel):
    """Prompt object that is used for OpenAI responses.

    :param id: Unique identifier of the prompt template
    :param variables: Dictionary of variable names to OpenAIResponseInputMessageContent structure for template substitution
    :param version: Version number of the prompt to use (defaults to latest if not specified)
    """

    id: str
    variables: dict[str, OpenAIResponseInputMessageContent] | None = None
    version: str | None = None
Thanks, @leseb! Basically, I attached a testing workflow to the PR description that shows how prompts work in the Responses API, but it only involves basic prompts with text variables. Let me know if you want me to show a working example with input image and input file variables.
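For reference, a minimal end-to-end sketch of that workflow (assumptions: a Llama Stack server on localhost:8321 exposing the OpenAI-compatible routes under /v1/openai/v1, and a prompt already stored via the Prompts API with the id and variables shown in the test output below):

from openai import OpenAI

# Point the standard OpenAI client at the Llama Stack server (URL is an assumption).
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

response = client.responses.create(
    model="openai/gpt-4o",
    input="What is the capital of Ireland?",
    prompt={
        "id": "pmpt_dc6c124c7f1393cd4ddb88ed707ffbfd3d937e644d10052c",
        "version": "2",
        "variables": {
            "area_name": "geography",
            "company_name": "Llama Stack",
        },
    },
)
print(response.output[0].content[0].text)  # e.g. "The capital of Ireland is Dublin."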
@r3v5 yeah, additional tests/validation would be great with a working example.
The PR description shows a working example of Prompt inside the Responses create, but I'd like to see …
Force-pushed from d63e31f to b954305
@r3v5 unit tests are failing.
I just rebased from main today. Still haven't finished my implementation.
Force-pushed from b954305 to 1660935
Force-pushed from 1660935 to 7a7b2b7
Hey @leseb, @franciscojavierarceo! Here is comprehensive testing of Prompts support in the Responses API via curl requests to the LLS server.

Test Prompts with images with text on them in the Responses API. I used this image for testing purposes: iphone 17 image
Output after inferencing:

The same example, but without providing the description of the product:
Output:

Test Prompts with PDF files in the Responses API. I used this PDF file for testing purposes: invoicesample.pdf
Output after inferencing:

Test a simple text Prompt in the Responses API:
Output after inferencing:
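To make the image case concrete, a hypothetical sketch of such a request, reusing the client from the sketch above (the prompt id, variable names, and image URL are placeholders; the content shape mirrors the OpenAIResponseInputMessageContent mapping in OpenAIResponsePromptParam):

response = client.responses.create(
    model="openai/gpt-4o",
    prompt={
        "id": "pmpt_<48-digit-hash>",
        "variables": {
            # Image variable: an input_image content object instead of plain text.
            "product_image": {
                "type": "input_image",
                "image_url": "https://example.com/iphone17.png",
            },
            "product_description": "Latest flagship phone",
        },
    },
)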
The implementation is there :)
What does this PR do?
The purpose of this PR is to integrate the Prompts API into the Responses API to achieve full OpenAI compatibility for the current Responses API in Llama Stack.
Closes #3321
Test Plan
Manual API testing and running newly added unit tests.
Prerequisites:
uv run --with llama-stack llama stack build --distro starter --image-type venv --run
API Testing:
Output:
{"created_at":1758562266,"error":null,"id":"resp-f20d3abe-d3cf-416b-9060-e653134d51cf","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"The capital of Ireland is Dublin.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_67a2f184-ae00-4f49-b4d2-27dfd22623a6","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"prompt":"You are a helpful {{ area_name }} assistant in {{ company_name }}. Always provide accurate information.","version":2,"prompt_id":"pmpt_dc6c124c7f1393cd4ddb88ed707ffbfd3d937e644d10052c","variables":["area_name","company_name"],"is_default":false},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"truncation":null,"user":null}%
Telemetry log (only for testing purposes, disabled in the final version of the code):
Output: same response object as above.